From TPC-C to Big Data Benchmarks: A Functional Workload Model
Authors
Abstract
Big data systems help organizations store, manipulate, and derive value from vast amounts of data. Relational databases and MapReduce are the two most prominent technologies for such systems. Organizations use them to perform complex analysis on diverse and unconventional data types with fast-growing data volumes. As more big data systems are deployed, the industry faces the challenge of developing representative benchmarks that can evaluate the capabilities of competing implementations. In this position paper, we argue for building future big data benchmarks using what we call a “functional workload model”. This concept draws on combined experiences from standard benchmarks, exemplified by TPC-C. The functional workload model describes the functional goals that the system must achieve, the data access patterns, the load variations over time, and the computation required to achieve the functional goals. Abstracting functional workload models from empirical studies of MapReduce deployments represents the first step towards building truly representative big data benchmarks.
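As a rough illustration of the four components the abstract lists, a functional workload model could be held in a small record type. This is only a sketch under stated assumptions: the class name, field names, the `peak_to_mean_load` helper, and all example values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FunctionalWorkloadModel:
    """Hypothetical container for the four components named in the abstract."""

    functional_goals: List[str]      # what the system must achieve
    data_access_patterns: List[str]  # how data is read and written
    load_over_time: List[float]      # relative load per time interval
    computation: List[str]           # work required to meet the goals

    def peak_to_mean_load(self) -> float:
        """Ratio of peak load to mean load, one simple summary of load variation."""
        mean = sum(self.load_over_time) / len(self.load_over_time)
        return max(self.load_over_time) / mean


# Illustrative values only; they do not come from the paper or from any benchmark.
example = FunctionalWorkloadModel(
    functional_goals=["record a customer order"],
    data_access_patterns=["point lookups by key", "small multi-row updates"],
    load_over_time=[1.0, 2.0, 3.0, 2.0],
    computation=["validate stock", "update totals"],
)
print(example.peak_to_mean_load())  # 1.5
```

Keeping the goals separate from the access patterns and computation mirrors the abstract's framing, in which the benchmark states what must be achieved rather than prescribing a particular implementation.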
Similar resources
Stream Processing Systems Have Arrived at the Big Data Party. But Where Are All the Benchmarks?
Stream processing systems have now become an integral part of the Big Data ecosystem. Unfortunately, streaming benchmarks have not followed suit, leading to non-representative benchmarking of systems. Benchmarks in general have many use cases, including: (a) comparing two or more systems, (b) matching applications and workloads to systems, and (c) configuring and optimizing a system. Due to these...
Analysis of the Characteristics of Production Database Workloads and Comparison with the TPC Benchmarks
There has been very little empirical analysis of any real production database workloads. Although the Transaction Processing Performance Council benchmarks C (TPC-C) and D (TPC-D) have become the standard benchmarks for online transaction processing and decision support systems respectively, there has also not been any major effort to systematically analyze their workload characteristics, espec...
Why You Should Run TPC-DS: A Workload Analysis
The Transaction Processing Performance Council (TPC) is completing development of TPC-DS, a new generation industry standard decision support benchmark. The TPC-DS benchmark, first introduced in the paper “The Making of TPC-DS” [9] at the 32nd International Conference on Very Large Data Bases (VLDB), has now entered the TPC’s “Formal Review” phase for new benchmarks; companies and researchers ali...
Benchmarking Hybrid OLTP&OLAP Database Systems
Recently, the case has been made for operational or real-time Business Intelligence (BI). As the traditional separation into OLTP database and OLAP data warehouse obviously incurs severe latency disadvantages for operational BI, hybrid OLTP&OLAP database systems are being developed. The advent of the first generation of such hybrid OLTP&OLAP database systems requires means to characterize their...
A methodology for auto-recognizing DBMS workloads
The type of the workload on a database management system (DBMS) is a key consideration in tuning the system. Allocations for resources such as main memory can be very different depending on whether the workload type is Online Transaction Processing (OLTP) or Decision Support System (DSS). A DBMS also typically experiences changes in the type of workload it handles during its normal processing c...